CASIA-PGPS9K: Plane Geometry Problem Solving Dataset

1. Introduction

The Plane Geometry Problem Solving Dataset (PGPS9K) was constructed by the State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation of Chinese Academy of Sciences (CASIA). Each sample in PGPS9K is labeled with both a fine-grained diagram annotation and an interpretable solution program. The diagram annotation is converted into structural clauses and semantic clauses, which together describe the multi-level information in a geometry diagram.

Download: CASIA-PGPS9K.zip (126 MB)

Fig.1 An example of PGPS9K.


2. Collection and Description

PGPS9K is composed of 9,022 geometry problems paired with 4,000 non-duplicate geometry diagrams. Of these, 2,891 problems paired with 1,738 diagrams are selected from the Geometry3K dataset; the rest are collected from five popular textbooks across grades 6-12 on mathematics curriculum websites. PGPS9K is divided into 30 problem types, as shown in Fig. 2, covering almost all types of plane geometry problems in the corresponding grades.

Tab.1 Comparison with existing geometry problem solving datasets. Type, OP and PL represent problem type, operator number and program length, respectively.

Fig.2 Distribution of problem types of PGPS9K dataset.

As shown in Fig. 3, the PGPS9K dataset has five properties that focus it on the challenges of geometric reasoning and alleviate the bias introduced by the text:

  • Theorem-based: Solving problems in PGPS9K requires applying geometric theorems to carry out algebraic calculations and finally obtain numerical results;
  • Diagram-dependent: More than 90% of the problems must be solved using the diagrams, because necessary conditions such as variable content and geometric structure are presented visually instead of in text;
  • Abstract: The diagram integrates basic geometric primitives (point, line, circle) and non-geometric primitives (text, symbol). The textual problem involves no complex semantic scenarios, only abstract geometric conditions;
  • Fine-grained: Problems with the same diagram vary in conditions or targets. Slight differences in the textual problems usually lead to completely different solutions;
  • Condition-redundancy: Many conditions in the semantic clauses or the textual problem are not needed to solve the problem at hand. Statistics show that, on average, 1.9 conditions per problem are not used, and 42% of problems have redundant conditions.

Fig.3 More examples from the PGPS9K dataset.

    Moreover, for convenience of experimental comparison, we split PGPS9K in two ways. The first holds out the test set of Geometry3K as the test set (589) and uses the other disjoint samples as the training set (8,433). The second divides the samples of each problem type at a ratio of 8:1 (training set 8,022 and test set 1,000).
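A per-type split like the second one can be sketched as follows. This is only an illustration: the `type` field name, the random shuffling and the rounding rule are assumptions, and the official split files shipped with the dataset should be used for any experimental comparison.

```python
import random
from collections import defaultdict


def split_by_type(samples, train_parts=8, test_parts=1, seed=0):
    """Split samples within each problem type at about train_parts:test_parts.

    `samples` is a list of dicts with a hypothetical "type" key.
    """
    by_type = defaultdict(list)
    for sample in samples:
        by_type[sample["type"]].append(sample)

    rng = random.Random(seed)
    train, test = [], []
    for _, group in sorted(by_type.items()):
        rng.shuffle(group)
        # Number of test samples for this type, rounded to the nearest integer.
        n_test = round(len(group) * test_parts / (train_parts + test_parts))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test
```

Splitting inside each type keeps the type distribution of the test set close to that of the training set, which a single global shuffle does not guarantee for rare types.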


    3. Annotation Form

    The annotations of PGPS9K include a diagram annotation and a solution program: the diagram annotation extracts the structural and semantic information in the diagram, and the solution program defines the solution steps of the problem.

    3.1 Diagram Annotation and Textual Clauses

    Diagram annotation adopts the same primitive-level labels as the CASIA-PGDP5K dataset, which include primitive contents and primitive relations in tuple form. We then translate them into two kinds of textual clauses: structural clauses and semantic clauses. The structural clauses are confined to the connection relations among geometric primitives and are described by clauses with points on lines or points on circles, wherein the points are arranged in order. The connection relation reveals the most fundamental structural relation, which is displayed in the diagram but omitted in the textual problem. The semantic clauses describe basic relations between geometric primitives and non-geometric primitives in natural language. These relations are necessary for problem solving, and those in the diagram and those in the textual problem complement each other. Tab. 2 displays the complete templates of textual clauses, consisting of 3 types of structural clauses and 6 types of semantic clauses. Note that the definition and descriptive approach of textual clauses remain open; the overall design principle is to characterize the complete features of the diagram to help with geometry problem solving (GPS). Our translation code between diagram annotation and textual clauses is here:

    Download: anno2clause.zip (8.3 KB)

    Tab.2 Templates of textual clauses. The symbols of ’&’, ’*’, ’$’, ’%’ denote point, line, variable and angle ID, respectively.
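To make the idea of a structural clause with ordered points concrete, the sketch below orders the points lying on one line and emits a single clause. It is not the code in anno2clause.zip: the function name `line_clause`, the coordinate dictionary and the output template `line A B C` are assumptions for illustration, and the actual templates are those listed in Tab. 2.

```python
def line_clause(point_names, coords):
    """Return a clause listing the points of one line in order along the line.

    point_names: names of the points annotated as lying on the line.
    coords: dict mapping a point name to its (x, y) position in the diagram.
    The "line ..." template is a hypothetical stand-in for the Tab. 2 templates.
    """
    if len(point_names) < 2:
        raise ValueError("a line needs at least two points")
    xs = [coords[p][0] for p in point_names]
    ys = [coords[p][1] for p in point_names]
    # Sort along the axis with the larger extent, so that vertical or
    # near-vertical lines are ordered by y instead of by a nearly constant x.
    axis = 0 if max(xs) - min(xs) >= max(ys) - min(ys) else 1
    ordered = sorted(point_names, key=lambda p: coords[p][axis])
    return "line " + " ".join(ordered)
```

The direction of the ordering (for example A C B versus B C A) is arbitrary in this sketch; what matters is that neighbouring points in the clause are neighbours on the line.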

    3.2 Solution Program

    Our solution program gives the geometric solution procedure as a sequence of deduction steps. It is composed of 34 operators OP and 55 operands PN, where an operator and a few related operands form one step. Each operator implies one geometric theorem or axiom, and the operands involved are ordered according to the corresponding theorem formula. Operands are divided into four types: problem variables N (11) presented in the textual problem or semantic clauses, process variables V (7) generated during the solution process, arguments ARG (26), which are the alphabetic unknowns [a-z], and constants C (11). For example, the Pythagorean theorem relates the two legs and the hypotenuse of a right triangle through the formula a^2+b^2=c^2, so we express it as "Gougu(a, b, c)". In addition, we introduce process variables V, which serve as unknown variables within a step and as transfer variables between steps, unifying the forward and reverse operations within one theorem. For instance, in the Pythagorean theorem, "Gougu(V, *, *)" and "Gougu(*, *, V)" solve for a leg and the hypotenuse, respectively.
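The way one operator covers both the forward and the reverse operation can be sketched as below. This is a minimal illustration and not the dataset's program executor: the function name `gougu`, the use of the string "V" as the placeholder for a process variable, and the plain numeric operands are all assumptions.

```python
import math


def gougu(a, b, c):
    """Solve a^2 + b^2 = c^2 for the single operand given as "V".

    a and b are the legs and c is the hypotenuse; exactly one of the three
    must be the placeholder string "V" (a hypothetical encoding).
    """
    if [a, b, c].count("V") != 1:
        raise ValueError("exactly one operand must be the process variable V")
    if c == "V":
        # Forward use: both legs known, solve for the hypotenuse.
        return math.hypot(a, b)
    # Reverse use: hypotenuse and one leg known, solve for the other leg.
    known_leg = b if a == "V" else a
    if c <= known_leg:
        raise ValueError("the hypotenuse must be longer than the known leg")
    return math.sqrt(c * c - known_leg * known_leg)
```

For example, `gougu(3, 4, "V")` solves for the hypotenuse and `gougu("V", 4, 5)` solves for a leg; in a multi-step program the returned value would be bound to a process variable and passed to a later step.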

    Fig.4 Annotation of solution program and its interpretability.

    Tab.3 Program sets defined in solution program.

    It should be noted that our solution program still confronts issues similar to those of general math word problems:

  • Uncertainty of exchangeable operands: The operands in some theorem formulas are commutative, e.g., in the Pythagorean theorem with formula a^2+b^2=c^2, the two legs are exchangeable. In our annotation, we normalize the solution program by specifying a two-level priority for commutative operands. The first is the class level, "argument > process variable > problem variable > constant", and the second is the index level, in ascending order.
  • Uncertainty of equivalent step orders: Calculation steps sometimes have no particular order. We manually keep the same pre-defined step order for the same problem type.
  • Multiple solution methods: Some geometry problems can be solved by multiple methods. We choose the method with the most concise solution program.

    Tab.4 Theorem (axiom) formulas.
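The two-level priority for exchangeable operands can be sketched as a sort key. The token spellings are assumptions made for this example (a single lowercase letter for an argument, and "V", "N" or "C" followed by an index for the other classes), as is the reading that a higher-priority operand is written first; the actual operand sets are those defined in Tab. 3.

```python
import re

# Class-level priority: argument > process variable > problem variable > constant.
CLASS_RANK = {"ARG": 0, "V": 1, "N": 2, "C": 3}


def operand_key(token):
    """Return (class rank, index) for an operand token such as "a", "V0", "N3"."""
    if re.fullmatch(r"[a-z]", token):
        return (CLASS_RANK["ARG"], ord(token) - ord("a"))
    match = re.fullmatch(r"([VNC])(\d+)", token)
    if match is None:
        raise ValueError(f"unrecognized operand token: {token!r}")
    return (CLASS_RANK[match.group(1)], int(match.group(2)))


def normalize_commutative(operands):
    """Order exchangeable operands by class priority, then by ascending index."""
    return sorted(operands, key=operand_key)
```

Under these assumptions, the two legs in a Pythagorean step written as `["N1", "V0"]` would be normalized to `["V0", "N1"]`, so equivalent annotations map to the same program.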


    3.3 File Formats

    Fig.5 Format of problem annotation.

    Fig.6 Format of diagram annotation.


    4. Condition of Use

  • The CASIA-PGPS9K: Plane Geometry Problem Solving Dataset, built by CASIA, is released for academic research free of cost under an agreement.
  • Commercial use of the dataset is subject to charge. For a possible commercial license, please contact Fei Yin (fyin@nlpr.ia.ac.cn).
  • The application form of the dataset for academic research can be downloaded below:

          English version

          Chinese version


    Reference

    If this dataset helps you, please cite the papers below:

       [1] Ming-Liang Zhang, Fei Yin and Cheng-Lin Liu. A Multi-Modal Neural Geometric Solver with Textual Clauses Parsed from Diagram, In IJCAI 2023.

       [2] Ming-Liang Zhang, Fei Yin, Yi-Han Hao and Cheng-Lin Liu. Plane Geometry Diagram Parsing, In IJCAI 2022.

       [3] Yihan Hao, Mingliang Zhang, Fei Yin and Lin-Lin Huang. PGDP5K: A Diagram Parsing Dataset for Plane Geometry Problems, In ICPR 2022.


    Contact

    Cheng-Lin Liu (liucl@nlpr.ia.ac.cn), Fei Yin (fyin@nlpr.ia.ac.cn)

    State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS)

    Institute of Automation of Chinese Academy of Sciences

    95 Zhongguancun East Road, Beijing 100190, P.R. China